# EN3240: Embedded Systems Engineering Assignment 5 — Research Paper Review

- Name: B. P. Thalagala, Index No: 180631J
- Name: M.G.S.Thanujaya, Index No: 180634V
- Name: M.W.K.S.Jayalath, Index No: 180260U
- Name: R.M.A.S.Rathnayake, Index No: 180534N
- Name: G.R.U.Y.Gamlath, Index No: 180191H

October 2, 2022

This is a group assignment!

Due Date: 2 October 2022 by 11.59 PM

## Instructions

This assignment has two main components.

1. Paper review (5 points), to be done using this template
2. Presentation (5 points)

Please read the paper assigned to you and write a review of the paper using the template given below. Prepare a separate PowerPoint presentation and present the paper within 10 minutes. Record a video of the presentation. Each member of your team must present (approximately 2 minutes each for a group of 5). Use any PowerPoint/LaTeX template you are comfortable with. Use any format you think is suitable to explain the paper.

Upload both the paper review and presentation video in one zip file to Moodle.

Paper Title: Fine-Grained Dynamic Voltage and Frequency Scaling for Precise Energy and Performance Tradeoff Based on the Ratio of Off-Chip Access to On-Chip Computation Times

Author(s): Kihwan Choi, Student Member, IEEE, Ramakrishna Soma, Student Member, IEEE, and Massoud Pedram, Fellow, IEEE

Conference/Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Year of Publication: 2005

## Overall merit

- Reject
- Weak reject
- Weak accept
- Accept  $\Leftarrow$
- Strong accept

## Reviewer expertise

- No familiarity
- Some familiarity  $\Leftarrow$
- Knowledgeable
- Expert

## Paper Summary

Dynamic Voltage and Frequency Scaling (DVFS) is an energy optimization method for embedded computing. It dynamically scales the supply voltage and clock frequency of the Central Processing Unit (CPU) so that the circuit runs just fast enough to process the system workload. The authors propose a fine-grained DVFS solution for a precise energy and performance trade-off, based on the ratio of off-chip access time to on-chip computation time. Calculating this ratio statically is difficult because of the unpredictable dynamic behaviour of microprocessors. The authors therefore compute the ratio dynamically from parameters obtained through the Performance Monitoring Unit (PMU), which is commonly available in present-day microprocessors. The parameters of the current time unit are then used to predict the required target frequency for the next time unit using a linear regression technique. In reality, however, the ratio varies significantly even within a single time unit, and the paper also proposes the modifications needed to take that variation into account and make the DVFS more fine-grained. In addition, the authors present a hardware implementation of the proposed policy on a high-performance XScale-based testbed. The hardware results show that for CPU-bound applications the actual performance was very close to the target value, whereas for memory-bound applications the difference was considerable. Using the proposed method, the authors obtained a 70% CPU energy saving with about 12% performance degradation for memory-bound programs, and 15% to 60% energy savings with fine-tuned performance degradation ranging from 5% to 20% for CPU-bound programs.
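The frequency-selection idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that execution time splits into an on-chip part that scales with CPU frequency and an off-chip part that does not, and that PMU-style samples of cycles per instruction (CPI) and memory accesses per instruction (MPI) are collected at the maximum frequency. The function names, the sample values, and the CPI-versus-MPI form of the regression are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("MPI samples must not all be identical")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def offchip_ratio(mpi_samples, cpi_samples):
    """Estimate beta = off-chip time / on-chip time for one time unit.

    Assumption: CPI ~ slope * MPI + intercept, where the intercept is the
    on-chip CPI and the remainder of the average CPI is off-chip stall time.
    """
    _, cpi_onchip = fit_line(mpi_samples, cpi_samples)
    if cpi_onchip <= 0:
        raise ValueError("regression produced a non-positive on-chip CPI")
    cpi_avg = sum(cpi_samples) / len(cpi_samples)
    cpi_offchip = max(cpi_avg - cpi_onchip, 0.0)
    return cpi_offchip / cpi_onchip


def target_frequency(f_max, beta_at_fmax, pf_loss):
    """Lowest frequency whose slowdown stays within pf_loss.

    Assumed model: T(f) = T_on * (f_max / f) + T_off, with
    beta_at_fmax = T_off / T_on measured at f_max.
    Solving T(f) = (1 + pf_loss) * T(f_max) for f gives the expression below.
    """
    return f_max / (1.0 + pf_loss * (1.0 + beta_at_fmax))


# Hypothetical samples for one time unit, taken at f_max = 600 MHz.
beta = offchip_ratio([0.0, 0.01, 0.02], [1.0, 2.0, 3.0])
f_next = target_frequency(600.0, beta, pf_loss=0.10)
```

Under this model a larger off-chip ratio allows a lower frequency for the same permitted slowdown, which is consistent with the larger energy savings reported for memory-bound programs. With `beta = 1` and a 10% allowed performance loss, 600 MHz scales down to 500 MHz.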

## Comments for Author

In general, the quality of the content and the flow of the paper are well maintained. In the abstract, the authors successfully summarize their proposed method, key results, and main conclusion. The proposed method for obtaining a precise energy and performance trade-off based on the ratio of off-chip access to on-chip computation time (as stated in the paper title) is given the main focus and is explained comprehensively and clearly, with appropriate figures and relevant equations. This also confirms that the title of the paper matches its content. The tables and the quality of the figures throughout the paper are satisfactory. Moreover, the prior work presented shows that the authors have surveyed the literature adequately. The methodology of workload partitioning and scaling the CPU frequency is clear in Section "III. PERFORMANCE-ENERGY TRADEOFFS". However, the concepts in subsection "C. Scaling Granularity" of that section would benefit from further explanation, as the sentences there are somewhat confusing and their meaning is hard to follow. In Equation 4, the $W^3_{offchip}$ term in the denominator seems as though it should be $W^2_{offchip}$. In Section "V. IMPLEMENTATION", the explanation currently reads as if it is given in reverse order; the flow would be easier to follow if it started from kernel space and moved towards the voltage control. The experimental results are clearly described using appropriate figures. It would be better if the authors compared their results with similar prior work. Nevertheless, the work makes a significant contribution to DVFS technology, as it presents one of the first actual implementations of an intraprocess DVFS policy that exploits dynamic events at runtime.